17 research outputs found

    Maximum Entropy Models of Shortest Path and Outbreak Distributions in Networks

    Properties of networks are often characterized in terms of features such as node degree distributions, average path lengths, diameters, or clustering coefficients. Here, we study shortest path length distributions. On the one hand, average as well as maximum distances can be determined from them; on the other hand, they are closely related to the dynamics of network spreading processes. Because of the combinatorial nature of networks, we apply maximum entropy arguments to derive a general, physically plausible model. In particular, we establish the generalized Gamma distribution as a continuous characterization of shortest path length histograms of networks of arbitrary topology. Experimental evaluations corroborate our theoretical results.
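
As a minimal sketch of the quantity this abstract studies, the following computes the shortest path length histogram of a small unweighted, undirected graph by breadth-first search and reads the average distance and the diameter off that histogram. The example graph and all names (`shortest_path_histogram`, `path_graph`) are illustrative assumptions, not taken from the paper, and fitting a generalized Gamma distribution to the histogram is not shown.

```python
from collections import Counter, deque


def shortest_path_histogram(adjacency):
    """Count ordered node pairs (u, v), u != v, by shortest path length.

    `adjacency` maps each node to a list of its neighbors (hypothetical
    input format chosen for this sketch). Unreachable pairs are skipped.
    """
    histogram = Counter()
    for source in adjacency:
        # Breadth-first search yields hop distances from `source`.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
        for target, hops in distance.items():
            if target != source:
                histogram[hops] += 1
    return dict(histogram)


# Hypothetical example: the path graph 0 - 1 - 2 - 3.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
hist = shortest_path_histogram(path_graph)

# Average and maximum distance follow directly from the histogram.
total_pairs = sum(hist.values())
mean_length = sum(hops * count for hops, count in hist.items()) / total_pairs
diameter = max(hist)
```

Counting ordered pairs doubles every bin relative to unordered pairs, which leaves the normalized distribution, the mean, and the diameter unchanged.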

    Graphical models beyond standard settings: lifted decimation, labeling, and counting

    With increasing complexity and growing problem sizes in AI and Machine Learning, inference and learning are still major issues in Probabilistic Graphical Models (PGMs). On the other hand, many problems are specified in such a way that symmetries arise from the underlying model structure. Exploiting these symmetries during inference, which is referred to as "lifted inference", has led to significant efficiency gains. This thesis provides several enhanced versions of known algorithms that are shown to be liftable too, and thereby applies lifting in "non-standard" settings. By doing so, the understanding of the applicability of lifted inference and lifting in general is extended. Among various other experiments, it is shown how lifted inference in combination with an innovative Web-based data harvesting pipeline is used to label author-paper pairs with geographic information in online bibliographies. This results in a large-scale transnational bibliography containing affiliation information over time for roughly one million authors. Analyzing this dataset reveals the importance of understanding count data. Although counting is done literally everywhere, mainstream PGMs have widely neglected count data. In the case where the ranges of the random variables are defined over the natural numbers, crude approximations to the true distribution are often made by discretization or a Gaussian assumption. To handle count data, Poisson Dependency Networks (PDNs) are introduced, which present a new class of non-standard PGMs naturally handling count data.

    GeoDBLP: Geo-Tagging DBLP for Mining the Sociology of Computer Science

    Many collective human activities have been shown to exhibit universal patterns. However, the possibility of universal patterns across timing events of researcher migration has barely been explored at a global scale. Here, we show that timing events of migration within different countries exhibit remarkable similarities. Specifically, we look at the distribution governing the data of researcher migration inferred from the web. Compiling the data in itself represents a significant advance in the field of quantitative analysis of migration patterns. Official and commercial records are often access-restricted, incompatible between countries, and especially not registered across researchers. Instead, we introduce GeoDBLP, where we propagate geographical seed locations retrieved from the web across the DBLP database of 1,080,958 authors and 1,894,758 papers. But perhaps more important is that we are able to find statistical patterns and create models that explain the migration of researchers. For instance, we show that the science job market can be treated as a Poisson process with individual propensities to migrate following a log-normal distribution over the researcher's career stage. That is, although jobs enter the market constantly, researchers are generally not "memoryless" but have to care greatly about their next move. The propensity to make k>1 migrations, however, follows a gamma distribution, suggesting that migration at later career stages is "memoryless". This aligns well with, but actually goes beyond, scientometric models typically postulated based on small case studies. On a very large, transnational scale, we establish the first general regularities that should have major implications for strategies for education and research worldwide.

    An authoring tool for educators to make virtual labs

    This paper focuses on the design and implementation of a tool that allows educators to author 3D virtual labs. The methodology followed is based on web 3D frameworks such as three.js and WordPress, which allowed us to develop simplified interfaces for modifying Unity3D templates. Two types of templates, one for Chemistry labs and one for Wind Energy labs, were developed to test the generalization, user-friendliness, and usefulness of such an approach. Results have shown that educators are very interested in the general concept, but several improvements should be made to the user-friendliness and intuitiveness of the interfaces so that educators inexperienced in 3D gaming can make such an attempt. Peer-reviewed.

    Mathematical Models of Fads Explain the Temporal Dynamics of Internet Memes

    Internet memes are a pervasive phenomenon on the social Web. They typically consist of viral catch phrases, images, or videos that spread through instant messaging, (micro) blogs, forums, and social networking sites. Due to their popularity and proliferation, Internet memes attract interest in areas as diverse as marketing, sociology, or computer science and have been dubbed a new form of communication or artistic expression. In this paper, we examine the merits of such claims and analyze how collective attention to Internet memes evolves over time. We introduce and discuss statistical models of the dynamics of fads and fit them to meme-related time series obtained from Google Trends. Given data on more than 200 memes, we find that our models provide more accurate descriptions of the dynamics of growth and decline of collective attention to individual Internet memes than previous approaches from the literature. In short, our results suggest that Internet memes are nothing but fads.

    Lifted Message Passing for Satisfiability

    Unifying logical and probabilistic reasoning is a longstanding goal of AI. While recent work on lifted belief propagation, which handles whole sets of indistinguishable objects together, is a promising step towards achieving this goal and even scales to realistic domains, it is not tailored towards solving combinatorial problems such as determining the satisfiability of Boolean formulas. Recent results, however, show that certain other message passing algorithms, namely survey propagation, are remarkably successful at solving such problems. In this paper, we propose the first lifted variants of survey propagation and its simpler version, warning propagation. Our initial experimental results indicate that they are faster than using lifted belief propagation to determine the satisfiability of Boolean formulas.

    How Viral Are Viral Videos?

    Within only a few years after the launch of video sharing platforms, viral videos have become a pervasive Internet phenomenon. Yet, notwithstanding growing scholarly interest, the suitability of the viral metaphor seems not to have been studied so far. In this paper, we therefore investigate the attention dynamics of viral videos from the point of view of mathematical epidemiology. We introduce a novel probabilistic model of the progression of infectious diseases and use it to analyze time series of YouTube view counts and Google searches. Our results on a data set of almost 800 videos show that their attention dynamics are indeed well accounted for by our epidemic model. In particular, we find that the vast majority of videos considered in this study show very high infection rates.

    Lifted Belief Propagation: Pairwise Marginals and Beyond

    Lifted belief propagation (LBP) can be extremely fast at computing approximate marginal probability distributions over single variables and neighboring ones in the underlying graphical model. It does not, however, prescribe a way to compute joint distributions over pairs, triples, or k-tuples of distant random variables. In this paper, we present an algorithm, called conditioned LBP, for approximating these distributions. Essentially, we select variables one at a time for conditioning, running lifted belief propagation after each selection. This naive solution, however, recomputes the lifted network from scratch in each step, often canceling the benefits of lifted inference. We show how to avoid this by efficiently computing the lifted network for each conditioning directly from the one already known for the single node marginals. Our experimental results validate that significant efficiency gains are possible and illustrate the potential for second-order parameter estimation of Markov logic networks.

    O Scientist, Where Art Thou? Affiliation Propagation for Geo-Referencing Scientific Publications

    Today, electronic scholarly articles are available freely at the point of use. Moreover, bibliographic systems such as DBLP, ACM’s Digital Libraries, Google’s Scholar, and Microsoft’s AcademicSearch provide means to search and analyze bibliographic information. However, one important piece of information is typically incomplete, wrong, or even missing: the affiliation of authors. This type of information can be valuable not only for finding and tracking scientists using map interfaces but also for automatic detection of conflicts of interest and, in aggregate form, for helping to understand topics and trends in science at a global scale. In this work-in-progress report, we consider the problem of retrieving affiliations from only a few observed affiliations. Specifically, we crawl ACM’s Digital Libraries for affiliations of authors listed in DBLP. Then, we employ multi-label propagation to propagate the few observed affiliations throughout a network induced by a Markov logic network on DBLP entries. We use the propagated affiliations to create a visualization tool, PubMap, that displays them on a map interface and thereby helps expose the affiliations. Furthermore, we motivate how the information about affiliations can be used in publication summarization.